feat: florencev2 fine-tuning meta tool #190

Merged
merged 12 commits into from
Aug 7, 2024

@Dayof Dayof commented Aug 6, 2024

Motivation

This meta tool fine-tunes the florencev2 model on one of three tasks: CAPTION, CAPTION_TO_PHRASE_GROUNDING, or OBJECT_DETECTION.
florencev2_fine_tuning calls the LandingAI public API endpoint /v1/agent/jobs/fine-tuning, which launches a fine-tuning job in the LandingAI environment.
Using the fine-tuning job ID, it will be possible to check the job status and to run inference with the fine-tuned model.
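
To illustrate the status check mentioned above, here is a minimal polling sketch. It is not part of this PR: `wait_for_job`, the `get_status` callback, and the status strings in `TERMINAL_STATES` are all hypothetical, and the real service may report different values or expose status in another way.

```python
import time
from typing import Callable

# Hypothetical status strings; the real service may report different values.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_job(
    job_id: str,
    get_status: Callable[[str], str],
    interval_s: float = 5.0,
    max_checks: int = 60,
) -> str:
    """Poll get_status(job_id) until it returns a terminal state.

    get_status is a caller-supplied function (an assumption, not part of
    vision_agent) that returns the current status string for a job ID.
    """
    for _ in range(max_checks):
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        # Wait before asking again so the service is not flooded with requests.
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_checks} checks")
```

Passing the status lookup in as a callback keeps the loop independent of how the status is actually fetched, so it can be exercised with a stub before any real endpoint is wired in.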

Local test

OBJECT_DETECTION

import sys

from vision_agent.agent import VisionAgent

img_path = sys.argv[1]  # image path passed on the command line
agent_gpt4t = VisionAgent(verbosity=2)
prompt = """fine-tune florencev2 with the following bounding boxes: 
[{'image': 'cereal_3.jpg', 'labels': ['screw'], 'bboxes': [(713, 1363, 909, 1567)]}]"""
agent_gpt4t(prompt, img_path)

Code generated:

from typing import *
from vision_agent.utils.execute import CodeInterpreter
from vision_agent.tools.meta_tools import generate_vision_code, edit_vision_code, open_file, create_file, scroll_up, scroll_down, edit_file, get_tool_descriptions, florencev2_fine_tuning

florencev2_fine_tuning([{'image_path': '/home/dayoff/Downloads/cereal_shankar/cereal/train/cereal_3.jpg', 'labels': ['screw'], 'bboxes': [[713, 1363, 909, 1567]]}], 'OBJECT_DETECTION')
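
The prompt above describes each sample with an 'image' file name and tuple boxes, while the generated call passes an absolute 'image_path' and list boxes. A rough sketch of that conversion follows. The helper name `normalize_records` and its `image_dir` parameter are hypothetical, POSIX-style paths are assumed, and this is not how the agent performs the conversion internally.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, List


def normalize_records(
    records: List[Dict[str, Any]], image_dir: str
) -> List[Dict[str, Any]]:
    """Convert prompt-style records into the shape seen in the generated call."""
    normalized = []
    for rec in records:
        normalized.append(
            {
                # Join the bare file name with the directory that holds the images.
                "image_path": str(PurePosixPath(image_dir) / rec["image"]),
                "labels": list(rec["labels"]),
                # Turn each box tuple into a plain list.
                "bboxes": [list(box) for box in rec["bboxes"]],
            }
        )
    return normalized
```

The result could then be passed as the first argument of florencev2_fine_tuning, with the task name as the second, as in the generated code above.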

CAPTION

import sys

from vision_agent.agent import VisionAgent

img_path = sys.argv[1]  # image path passed on the command line
agent_gpt4t = VisionAgent(verbosity=2)
prompt = """fine-tune florencev2 to be able to caption images, use the following bounding boxes to fine-tune: 
[{'image': 'cereal_3.jpg', 'labels': ['screw'], 'bboxes': [(713, 1363, 909, 1567)]}]"""
agent_gpt4t(prompt, img_path)

Code generated:

from typing import *
from vision_agent.utils.execute import CodeInterpreter
from vision_agent.tools.meta_tools import generate_vision_code, edit_vision_code, open_file, create_file, scroll_up, scroll_down, edit_file, get_tool_descriptions, florencev2_fine_tuning

florencev2_fine_tuning([{'image_path': '/home/dayoff/Downloads/cereal_shankar/cereal/train/cereal_3.jpg', 'labels': ['screw'], 'bboxes': [(713, 1363, 909, 1567)]}], 'CAPTION')

CAPTION_TO_PHRASE_GROUNDING

import sys

from vision_agent.agent import VisionAgent

img_path = sys.argv[1]  # image path passed on the command line
agent_gpt4t = VisionAgent(verbosity=2)
prompt = """fine-tune florencev2 to be able to turn caption to phrase grounding, use the following bounding boxes to fine-tune: 
[{'image': 'cereal_3.jpg', 'labels': ['screw'], 'bboxes': [(713, 1363, 909, 1567)]}]"""
agent_gpt4t(prompt, img_path)

Code generated:

from typing import *
from vision_agent.utils.execute import CodeInterpreter
from vision_agent.tools.meta_tools import generate_vision_code, edit_vision_code, open_file, create_file, scroll_up, scroll_down, edit_file, get_tool_descriptions, florencev2_fine_tuning

florencev2_fine_tuning([{'image_path': '/home/dayoff/Downloads/cereal_shankar/cereal/train/cereal_3.jpg', 'labels': ['screw'], 'bboxes': [(713, 1363, 909, 1567)]}], 'CAPTION_TO_PHRASE_GROUNDING')
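
All three local tests pass the task as a plain string. As a sketch of how a caller might guard against typos before launching a job, the task names can be collected in an enum. `FineTuneTask` and `parse_task` are hypothetical names for illustration only; the PR may represent tasks differently.

```python
from enum import Enum


class FineTuneTask(str, Enum):
    """The three task names listed in the Motivation section."""

    CAPTION = "CAPTION"
    CAPTION_TO_PHRASE_GROUNDING = "CAPTION_TO_PHRASE_GROUNDING"
    OBJECT_DETECTION = "OBJECT_DETECTION"


def parse_task(name: str) -> FineTuneTask:
    """Return the matching task, or raise ValueError listing the valid names."""
    try:
        return FineTuneTask(name)
    except ValueError:
        allowed = ", ".join(t.value for t in FineTuneTask)
        raise ValueError(f"unknown task {name!r}; expected one of: {allowed}") from None
```

Because the enum also subclasses `str`, a member such as `FineTuneTask.CAPTION` compares equal to the plain string 'CAPTION', so it can stand in wherever the string form is expected.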

Extras

  • Add pre-commit

@Dayof Dayof changed the title from "feat: florencev2 fine-tuning" to "feat: florencev2 fine-tuning meta tool" Aug 7, 2024

@dillonalaird dillonalaird left a comment

looks good!

@Dayof Dayof merged commit ae97907 into main Aug 7, 2024
8 checks passed
@Dayof Dayof deleted the feat/florence-fine-tuning branch August 7, 2024 20:41